Power law relating 10.7 cm flux to sunspot number 

Robert W. Johnson 
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Abstract To investigate the relation between observa- 
tions of the 10.7 cm flux and the international sunspot 
number so that a physical unit may be ascribed to his- 
torical records, both polynomial and power law mod- 
els are developed giving the radio flux as a function of 
sunspot number and vice versa. Bayesian data anal- 
ysis is used to estimate the model parameters and to 
discriminate between the models. The effect on the pa- 
rameter uncertainty and on the relative evidence of nor- 
malizing the measure of fit is investigated. The power 
law giving flux as a function of sunspot number is found 
to be the most plausible model and may be used to es- 
timate the radio flux from historical sunspot observa- 
tions. 

Keywords 10.7 cm flux, sunspot number, solar mag- 
netic activity 



1 Introduction 

That a relation exists between the 2800 MHz 10.7 
cm solar radio flux observed by ground stations and 
the sun spot number as defined by Wolf has lon g been 
known ( Covington 1969t Hathaway et al. 2002 ). The 



correspondence of the 10.7 cm flux with other indicators 
of solar activ ity as well as mechanisms for its origin are 
discussed by Tapping and Detracev ( 1990h . That solar 



processes is now well established 
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1998 




Johnson 
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sunspot number provides our longest continuous record 
of its level. Putting the sunspot number onto a footing 
with physical units is of intrinsic interest to the solar 
theorist. 
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Building a mathematical model to describe the re- 
lation between two quantities of physical interest is a 
popular pastime, and deciding whether to accept or re- 
ject a model based on a quality of fit parameter is often 
done. However, the essential question is not "how well 
does this model fit the data" but rather "how much 
better does this model fit the data relative to another 
model." If a single model is all that is available, its qual- 
ity of fit is irrelevant, as no better idea has presented 
itself. In Bayesian analysis ( Sivial Il996h . it is the ratio 
of the integrated evidence evaluated at the parameters 
of best fit which determines the relative plausibility of 
the models under consideration. After evaluating the 
best fitting parameters, we will compare their evidence 
ratios to determine the most plausible model consis- 
tent with the data. The nonlinearity inherent in the 
definition of the Wolf index proves particularly hard to 
model. 



2 Data selection and previous models 

Often when comparing two independent sets of mea- 
surements, the choice of which data to use for ab- 
scissa and which for ordinate is not unambiguous. Here 
we will consider polynomial and power law models 
each with three parameters relating the international 
sunspot number provided by the World Data Ce nter for 
the Sunspot Index, Belgium ( SIDC-team 20081) . to the 
adjusted Penticton/ Ottawa 2800 MHz solar flux pro- 
vided by the National Research Council of Canada and 
available through the National Geophysical Data Cen- 
ter, NOAA, USA. The adjusted flux compensates for 
variation in the earth-sun distance. These data sets do 
not quote variance values, which must then be set to 
unity for equal weighting of each data value. 

Previous investigators have usually selected a poly- 
nomial model for the relation between the solar flux 
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Table 1 Previous models available online: Fzm and Fzhs are from I Zhao and Han ( 20081 ) and Fips and Rips are from 
the IPS unit of the Australian Bureau of Meteorology 



Model 


Parameters 






Fzhi 


= 60.1 + 0.932 R D 






FzH3 


= 65.2 + 0.633 R D + 3.76 xlO" 3 R 2 D - 


1.28xl0 -5 


Rd 


Fips 


= 67.0 + 0.572 R D + 3.31 xl0~ 3 R 2 D - 


9.13xl0~ 6 


Rd 


Rips 


= 1.61 F B - 5.37 xl0~ 3 F B + 1.38 xlO 


~ 5 F% , F B 


= F D - 67.0 
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Fig. 1 Comparison of polynomial models available online 
using yearly data values 



Fig. 2 Comparison of polynomial models available online 
using monthly data values 



Fd and sunspot number Rr>. The subscript D will 
be used to dist i nguish data values from model values. 
Zha o and Hani lj2008ll consider both a linear fit and 
a cubic fit for F{Rd) using annual values for 1947- 
2005, and the Ionospheric Prediction Servic e (IPS) unit 
of th e Australian Bureau of Meteorology (jThompsonl 
2010) gives cubic equations for both F{Rn) and R(Fd) 
using monthly values from 1947-1990. The radio 
flux is expressed in solar flux units (sfu) equal to 
10 _22 W/m 2 /Hz. These model equations, written in a 
form comparable to that which we will investigate, are 
displayed in Table [TJ 

A graphical comparison of these models using the 
yearly data values for 1947-2008 is given in Figure [TJ 
We see that the linear model is not capable of match- 
ing the data at low activity levels, and shortly beyond 
the region displayed the cubic models for F(Rr>) in- 
flect downwards, implying a saturation of radio flux at 
extreme levels of solar magnetic activity. The corre- 
sponding inverse relation Rips{Fd) does not so inflect 



and is dominated by the cubic term at high flux lev- 
els. These remarks hold as well for the monthly values 
shown in Figure [2] 



3 Bayesian data analysis 

Our implementation of Bayesian d ata analysis draws 
primarily on the text bv lSivial (|1996). The essential fea- 
ture which takes it beyond simple regression is the use 
of a non-uniform prior in appropriate circu mstances. 
Using the language of conditional probabilities ([Durrett 
Il994l) we write "the probability of A given B under 
conditions /" as 



W ob{A\B-I)=p{A\ I B)=pi 



(1) 



when the back ground informatio n / is unchanging. The 
choice of prior (|D'Agostinilll998i ) represents one's back- 
ground knowledge on the likely distribution of a pa- 
rameter x € [xq,xi] before analysis of the current set 
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of data. A non-uniform prior pj arises naturally in 
many contexts, often representing a prior which is uni- 
form over a change of variables x —> F for some inte- 
grable function f[x) = dF/dx, with normalization^ = 
A^/W for A x = Q f(x) dx such that Q p x f dx = 1. 
Besides the uniform prior f^ 1 = 1, one commonly en- 
counters the Jeffreys prior f~ l = x uniform over log a; 
and the Cauchy distribution f~ l = 1 + x 2 uniform over 
arctanx. 

3.1 Parameter estimation 

One states Bayes' theorem in the context of parameter 
estimation as 



pi=P x Px/p D 



(2) 



reading "the evidence for parameters X given data D 
equals the prior for X times the likelihood for D given 
X divided by the chance of measuring D" . What we 
call "the evidence" is often called "the posterior", as 
the normalization constant p D affecting neither param- 
eter estimation nor model selection is sometimes called 
"evidence" ; both "prior" and "likelihood" have their 
usual meaning. The logarithm (base e) of Equation [2] 
reads Le = Fp + Ll + Jj=D, where the final term is 
a constant equal to — \ogp D . For independent data 
D = {D t } indexed by t with Gaussian noise <r, the 
likelihood factors as p x = U t (^ 7T ' T t)' 1 ^ 2 ex P(~Xt / 2 ), 
where Xt = [Mt(X) — D t ]/a t is the weighted residual 
of the model M, so that Fe has one term proportional 
to the measure of fit x 2 = J2t Xt an d another which is 
constant. With the definition of the merit function in 
terms of the model parameters, 



L x = l °SP x + \x 2 

xex 



(3) 



the p roblem becomes o ne of nonlinear global optimiza- 
tion ([Press et al.l Il992l). seeking a unique solution to 
the equation VxFe = 0. Short of evaluating the merit 
function over the entire prior range, one must rely on in- 
tuition and luck to varying degrees. One's intuition, en- 
coded in the form and domain of the prior functions p x , 
contributes to the gradient of the log evidence VxFe 
in the limit of poor data VxLl 0, thereby improving 
the chances of success. 

Getting slightly ahead of ourselves, let us remark 
here that the traditional definition of the measure of 
fit x 2 i s the unnormalized sum of weighted residuals 
squared. Recognizing that x 2 represents the variance 
of the data relative to the model, we believe that the 



normalized sum of weighted residuals is a more appro- 
priate measure of fit, which one defines as 



X 



Y,Wt{X) - D t ] 2 d t 



(4) 



with the normalized weights dt = Uf 2 / ^2 t c t ~ 2 play- 
ing the role of the discrete measure factor. For N t 
data values with unit variance a t = 1, the normalized 
measure of fit reduces to x 2 — X 2 /^t- In the contin- 
uum limit — > J the measure factor is made appar- 
ent x 2 = f[M(t) - D(t)] 2 dt, and the normalizat ion is 
required so that the measure of fit is not dependent 
upon the sampling rate — for an infinite or continuous 
data set, the unnormalized x 2 must be infinite for any 
model which does not perfectly match the data. Re- 
placing M t with a single parameter model given by the 
weighted mean of the data D — ^ t D(t)(a^~ / a^ 2 ) 
reveals the relationship between the measure of fit and 
the variance of the data vector Xjy = °r>- The nor- 
malization has no effect on the location of the maxi- 
mum likelihood solution Xl while influencing the rela- 
tive weighting of likelihood and prior in the expression 
for the evidence, thereby shifting the maximal evidence 
solution Xe for non-uniform priors; it also affects the 
width of the error bars assigned to the parameter val- 
ues, as exp(-x 2 /2) = [exp(-x 2 /2)] 1/A,t . 



3.2 Model selection 

Given a single model, all one can do is estimate its best 
fitting parameters — the quality of fit is irrelevant be- 
yond its role in the likelihood p^ and its evidence may 
be normalized to unity. However, faced with a choice of 
models, Bayes' theorem allows one to compute their ev- 
idence ratio i?^ B , which reduces to the likelihood ratio 



r>AB _ Pd _ 



Pa — r>AB 
Pb 



(5) 



pI p b p d b Ip d 

where p A — p B indicates no prior preference for either 
model. The null hypothesis of "no relation" is sup- 
ported only when one can define a noise model, as the 
likelihood cannot be computed for a model B given only 
that M t (B) ^ M t {A). The likelihood for each model 
is the unnormalized integral of the evidence for its pa- 
rameters, 



D I D,X, V 

Pm = I Pm dx = 



X D , v 

PMPx,M dX ■ 



(6) 



and may be identified as the "integrated probability 
bump" over the model parameters X. There is an un- 
fortunate confusion of nomenclature in the literature 
because pfj appears both in the position of chance in 
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Table 2 Summary of prior functions f x and domains 
[a!o,iCi] for the various models M 



M 


X 


f* 1 

J X 


Yearly 

Xq X\ 


Monthly 

Xo Xl 




B 


1 


55 


75 


55 


75 


Fi 


A 


l + A 2 





2 





2 




C 


1 + C 2 


-0.01 


0.01 


-0.01 


0.01 




B 


1 


55 


75 


55 


75 


F 2 


A 


A 


0.05 


5 


0.05 


5 




C 


C 


0.05 


5 


0.05 


5 




B 


1 


55 


75 


55 


75 


Ri 


A 


l + A 2 





2 





2 




C 


l + C 2 


-0.01 


0.01 


-0.01 


0.01 




B 


1 


55 


69 


55 


66.5 


R 2 


A 


A 


0.05 


5 


0.05 


5 




C 


C 


0.05 


5 


0.05 


5 



Equation ([2]) and in the position of likelihood in Equa- 
tion ©. 

Under the quadratic approximation, generally ac- 
ceptable when the evidence is not severely truncated by 
the prior range, one can evaluate the integral analyti- 
cally to write the negative logarithm of the likelihood 

as 



-L 



1 



A fe 



(7) 



for X indexed by k and {hk} the eigenvalues of the in- 
verse of the variance matrix for the parameters Yik ^fe = 
dct E^ 1 , where the first two terms are the value of the 
merit function evaluated at its minimum and the re- 
mainder comprise the Occam factor accounting for the 
ratio of the width of the evidence Sjf to the prior vol- 
ume {Afc}. An additional parameter must provide not 
just a better fit but a significantly better fit in order 
for its plausibility to increase. With several models to 
choose from, the one with the lowest value of — is 
deemed the most plausible, with the preference factor 
given by the exponential of the difference between the 
(negative) log evidence for each. 



4 Evaluation of the models 

With two functional forms, polynomial and power law, 
and an arbitrariness to the selection of abscissa and 
ordinate, we consider a total of four models, two for 
F(R D ) and two for R(F D ). As the 10.7 cm flux is 
observed never to fall below some background level ~65 
sfu, we use parameter B for the background level in 
all models. Parameter A will be an amplitude, and 
parameter C will be either another amplitude or the 



Table 3 Yearly analysis results for best fitting parameters 
B, A, and C, their standard deviations a, the quality of fit 
VioLe and —Le, and the evidence ratio -Rfj 1 , for both the 
unnormalized and normalized log likelihood — Ll 





M 


B A C 
<jb ga oc 


ViqLe 
-L e 


R 21 
H E 


x 2 

2 


F x 


62.87 0.835 0.0005 
0.31 0.009 0.0001 


-9.02 
2216 


9.69 
xlO 11 


F 2 


64.98 0.582 1.0970 
0.39 0.021 0.0082 


-10.32 
2189 


x 2 

2 


F x 


63.01 0.830 0.0005 
2.46 0.072 0.0004 


-10.62 
44.23 


4.42 


F 2 


65.54 0.550 1.1105 
3.02 0.148 0.0640 


-10.84 
42.74 


x 2 

2 


Ri 


62.75 1.245 -0.0012 
0.25 0.010 0.0001 


-8.89 
2403 


2.04 

xlO 36 


R2 


67.01 0.402 1.1979 
0.27 0.011 0.0081 


-10.40 
2319 


x 2 

2 


Ri 


62.63 1.239 -0.0012 
1.98 0.077 0.0005 


-10.47 
47.59 


13.2 


R 2 


67.30 0.388 1.2082 
2.04 0.083 0.0621 


-11.07 
45.00 



exponent in the power law. Specifically, we consider 
the three parameter models given by 



F 2 

Ri 
Ri 



B + AR D + CR 2 D , 

B + (AR D f , 

A(F D — B) + C(F D - B) 2 

A-^Fd-B) 1 / , 



(8) 
(9) 
(10) 

(11) 



where Fe> and Rd are the data selected for the abscissa 
and the form of i?2 is chosen to compare directly its pa- 
rameters with those of F-2- We will be neglecting any 
influence from a lag between the solar flux and sunspot 
numbers (jWilson et al.lll987t ISparavignal 120081 ). Upon 
a visual inspection of the normalized monthly data se- 
ries, any lag appears to be negligible at that temporal 
resolution. 

We summarize our use of priors in Table [2] A uni- 
form prior is assigned to B whose domain is adjusted for 
model i?2, which requires B < mm{Fr>}. The Cauchy 
distribution serves as the prior for the amplitudes of 
the polynomial models, and for the power law models 
the Jeffreys prior is taken for A and C. The Jeffreys 
and Cauchy priors share the property that they may 
be used equally for the forward and inverse models of 
F 2 and R 2 . As p 1/A \dA~ 1 /dA\ = p A cx A^ 1 , one may 
substitute A = A^ 1 to write p A oc A^ 1 , and similarly 
for the Cauchy prior. Our results are not influenced 
greatly by the choice of priors, indicating that the fit is 
driven primarily by the likelihood. 
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1 4 16 63 250 1 4 16 63 250 




Fig. 3 Comparison of the best fitting solutions to the 
yearly data values using the normalized measure of fit x 2 

4.1 Yearly analysis 

The results for our analysis of the yearly data values 
are presented in Table |31 where the logarithm of the 
norm of the gradient at the solution Xe is headed by 
Vio-^b = \og 1Q \VxLE(XE)\ and the negative log of the 
integrated evidence by —Le- The normalization of the 
measure of fit is indicated in the first column, and the 
evidence ratio R^ 1 is in the last column. As B(x 2 , R2) is 
within 3<tb of its upper limit, a numerical evaluation of 
its integrated evidence is necessary, which differs from 
the approximate value by only a few percent. 

We see that the various models give slightly different 
estimates for the background radio flux B. The polyno- 
mial models Fi and R\ return a value of about 63 sfu, 
while the power law model values are higher, around 65 
sfu for F2 and 67 sfu for i?2- These remarks hold for 
either normalization of the measure of fit. We compare 
in Figure |3] the model solutions using the normalized 
measure of fit for all four models — the solutions for the 
unnormalized measure of fit are visually indistinguish- 
able. Compared to Figure[TJ one can see that the power 
law F2 provides with three parameters a quality of fit on 
par with a polynomial of four parameters and does not 
suffer from inflection problems at high levels of solar 
activity. Polynomial models are notorious for having 
difficulties with extrapolation. 

While the solution location Xe is not greatly in- 
fluenced by the choice of \ 2 or X 2 in Ll, the width 



Table 4 Monthly analysis results to compare with Table [3] 



-L L 


M 


BAG 

(JB OA OC 


VioLb 

—Le 


Re 


x 2 

2 


F\ 


60.72 0.900 0.0002 
0.09 0.003 0.0000 


-7.76 
92524 


4.46 
xlO 119 


F 2 


62.72 0.686 1.0642 
0.12 0.007 0.0023 


-8.45 
92249 


x 2 

2 


Fx 


60.87 0.895 0.0003 
2.49 0.071 0.0004 


-10.37 
134.49 


4.55 


F 2 


63.36 0.645 1.0780 
3.19 0.178 0.0615 


-11.71 
132.98 


x 2 

2 


Rx 


62.42 1.363 -0.0024 
0.06 0.002 0.0000 


-7.88 
80922 


~ 


R2 


66.50 0.281 1.3305 
0.00 0.001 0.0014 


-7.98 
82917 


x 2 

2 


Ri 


62.35 1.360 -0.0024 
1.67 0.061 0.0004 


-10.39 
119.51 


5.86 
xlO" 3 


R2 


66.50 0.279 1.3332 
0.01 0.028 0.0381 


-9.97 
124.65 



of the marginal error bars is greater when using the 
normalized variance. This change in the width of the 
evidence has a strong impact on the evaluation of its in- 
tegral through the Occam factor in Equation ([7]). Con- 
sequently, the evidence ratio R^ indicating the prefer- 
ence factor for the power law over the polynomial model 
is vastly different for the two choices of L^. With such 
similarity in the model solutions Xe{x 2 ) and Xe(x 2 )i 
it is hard for us to countenance a preference factor on 
the order of 10 36 or even 10 12 . Using the normalized 
model variance x 2 gives a preference factor ~ 10 for the 
power law models. Furthermore, it seems reasonable to 
expect the variance of the background estimate B for 
each model to be on the order of the variance between 
the models, as is found when using x 2 ■ 

4.2 Monthly analysis 

Repeating the analysis using the monthly data values, 
we find the results shown in Table |U While the assess- 
ment of the models for F(Re>) is consistent with that of 
the yearly data, here we find that the power law model 
for R(Fd) is not to be preferred. The reason is because 
the background parameter B is very tightly constrained 
to a value just below the minimum of the abscissa data 
Fd . One might consider a modification of the model so 
that R2{Fq < B) = 0; however, such approach poses 
difficulties with the analytic evaluation of the gradient 
of the log likelihood. The estimates of the background 
for the models F(Rd) are lower compared to those from 
the yearly data, while those for i?i are about the same, 
as are the remainder of the parameters. We display the 
model solutions for the monthly data in Figure |4] 

As we are most interested in ascribing to the his- 
torical record of sunspot activity a physical unit based 
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Fig. 4 Comparison of the best fitting solutions to the 
monthly data values using the normalized measure of fit 

x 2 



on the solar radio flux, the consistency in preference 
for i<2 to Fi indicates the power law model function 
may be used for either yearly or monthly analysis. Us- 
ing ip and n to indicate the yearly and monthly solu- 
tions, we are tempted to compare boxes of apples to 
apples by looking at the difference between LsiF^) 
and LsiF^), made possible through the use of the nor- 
malized measure of fit x 2 . Reading the values from 
the tables, one can state that F2 fits the yearly data 
better than the monthly data by a factor of about 
exp(133 — 42.7) ~ 10 39 . Continuing the analogy to 
models for apples and bananas, one finds that fits 
better than R% by a factor exp(45 — 42.7) ~ 10. 



both the quality of fit and the error bars on the param- 
eters given by the determinant of the inverse variance 
matrix. The consequence is that for models with a sim- 
ilar measure of fit and prior volume, probability theory 
actually prefers the one with the larger error bars. The 
reason is because a greater range of its parameter space 
yields a model consistent with the data. 

Concluding, we have considered various models of 
three parameters relating the 10.7 cm solar radio flux 
to the international sunspot number. The parameters 
found using maximal evidence are consistent with those 
given by other investigators. Model selection using the 
evidence ratio indicates that the power law determining 
the solar flux from the sunspot number is most consis- 
tent with the yearly data values. That model may be 
used to ascribe to the historical sunspot record a value 
in solar flux units. 
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5 Discussion and Conclusions 

The primary difficulty the models face is in relating the 
international sunspot number derived from the original 
Wolf index to the physical flux measurements of the 
S-component oscillation at small magnitudes. It stems 
from the behavior of Rr> = ki(10g + s) for small values 
of spot and group numbers s and g. The Wolf index 
has a jump from to 11 for the first spot observed, and 
while the modern discontinuity is reduced slightly by 
the international reduction coefficient it still repre- 
sents a significant source of nonlinearity. 

An interesting feature of Bayesian model selection is 
that the log evidence, Equation j7]), contains factors for 
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